feat: reload config on SIGHUP #420

Open
blob42 wants to merge 1 commit into mostlygeek:main from blob42:feat-sigusr1-reload

Conversation

@blob42 blob42 commented Dec 1, 2025

  • Allows quick reloading with SIGHUP when a filesystem watch is not
    available, as in a default Docker setup: docker kill -s SIGHUP
  • Also refactored the signal handling into a single logical unit
  • Updated the config readme and the script for reloading on changes

coderabbitai bot commented Dec 1, 2025

Walkthrough

Main now listens for OS signals in a loop, handling SIGHUP to reload configuration via reloadProxyManager() and SIGINT/SIGTERM for graceful shutdown. Documentation and the example watcher were updated to instruct sending SIGHUP (or Docker equivalent) instead of restarting the process.

Changes

Cohort / File(s) | Summary
Documentation: docs/configuration.md | Added "Hot Reloading Configuration" describing triggering a reload via SIGHUP and a Docker example.
Example watcher / README: docs/examples/restart-on-config-change/README.md | Changed watcher to send SIGHUP for reload, updated workflow/messages, added inotifywait failure handling and process-existence checks, and a Docker hint to use docker kill -s SIGHUP.
Signal handling / runtime: llama-swap.go | Switched to a for-range signal loop; added syscall.SIGHUP handling to call reloadProxyManager(); preserved SIGINT/SIGTERM graceful shutdown path and added logging for unhandled signals.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Suggested labels

enhancement

Suggested reviewers

  • mostlygeek
🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name | Status | Explanation | Resolution
Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

Check name | Status | Explanation
Title check | ✅ Passed | The title 'feat: reload config on SIGHUP' directly and clearly summarizes the main change: adding SIGHUP signal support for configuration reloading.
Description check | ✅ Passed | The description explains the key motivation (SIGHUP reload for Docker environments), mentions signal handling refactoring, and notes documentation updates, all of which align with the changeset.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
docs/examples/restart-on-config-change/README.md (1)

16-34: Script no longer watches for multiple config changes after switching to SIGUSR1

With kill -SIGUSR1 $PID the llama-swap process keeps running, so the subsequent wait $PID blocks until the process eventually exits. That means:

  • The first config change triggers a reload via SIGUSR1.
  • Further changes are not detected because the loop doesn’t reach the next inotifywait until the process dies.

This is at odds with the description that the script “can be used to watch config.yaml for changes and send the SIGUSR1 signal … when it detects a change.”

Consider restructuring so the process is started once and the loop only waits on inotify + sends SIGUSR1 while the PID is alive, for example:

-while true; do
-    # Start the process again
-    ./llama-swap-linux-amd64 -config $1 -listen :1867 &
-    PID=$!
-    echo "Started llama-swap with PID $PID"
-
-    # Wait for modifications in the specified directory or file
-    inotifywait -e modify "$1"
-
-    # Check if process exists before sending signal
-    if kill -0 $PID 2>/dev/null; then
-        echo "Sending SIGUSR1 to $PID"
-        kill -SIGUSR1 $PID
-        wait $PID
-    else
-        echo "Process $PID no longer exists"
-    fi
-    sleep 1
-done
+# Start the process once
+./llama-swap-linux-amd64 -config "$1" -listen :1867 &
+PID=$!
+echo "Started llama-swap with PID $PID"
+
+while kill -0 "$PID" 2>/dev/null; do
+    # Wait for modifications in the specified file
+    inotifywait -e modify "$1"
+
+    if kill -0 "$PID" 2>/dev/null; then
+        echo "Sending SIGUSR1 to $PID"
+        kill -SIGUSR1 "$PID"
+    else
+        echo "Process $PID no longer exists"
+        break
+    fi
+done
+
+wait "$PID"
🧹 Nitpick comments (2)
docs/configuration.md (1)

387-397: Hot‑reload section is clear and consistent with implementation

The SIGUSR1 instructions and Docker example match the new signal handling behavior and the linked example README; this looks good. If you expect many Windows users, consider a brief note that SIGUSR1 is POSIX‑only and that Windows should rely on -watch-config or other mechanisms.

llama-swap.go (1)

80-81: SIGUSR1 handling and signal loop look good; check Windows build and reload concurrency

The refactored signal loop cleanly separates shutdown (SIGINT/SIGTERM) from reload (SIGUSR1), and the logic matches the docs and example output. A couple of follow‑ups to consider:

  • Cross‑platform build: This file has no build tags in the snippet. If it’s compiled on Windows as well, syscall.SIGUSR1 may not be defined there, which would break Windows builds. If that’s the case, either:
    • Add an appropriate build tag (e.g. //go:build !windows) to this file, or
    • Guard SIGUSR1 registration behind a platform check / separate Unix‑only file.
  • Reload serialization (optional): reloadProxyManager can now be triggered both via fsnotify (-watch-config) and SIGUSR1. If those are ever used together, you could see concurrent reload calls mutating srv.Handler and conf. If that scenario is realistic for your users, consider serializing reloads (e.g. via a mutex or a single “reload” goroutine) to avoid subtle races.

Please verify that:

  • Cross‑platform builds (especially GOOS=windows) still compile successfully with syscall.SIGUSR1 in this file.
  • You’re comfortable with the current concurrency story for reloadProxyManager or want to guard it with a mutex.

Also applies to: 173-201
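
The reload-serialization suggestion above can be sketched roughly as follows. This is a minimal illustration, not the project's code: `reloadGate`, `Trigger`, and `stressGate` are hypothetical names, and the reload callback stands in for whatever `reloadProxyManager` actually does. The idea is only that every trigger source (file watcher or signal handler) goes through one mutex, so two reloads never run at the same time.

```go
package main

import (
	"fmt"
	"sync"
)

// reloadGate (hypothetical name) funnels every reload request through one
// mutex so that only a single reload runs at any moment.
type reloadGate struct {
	mu     sync.Mutex
	reload func(source string)
}

// Trigger runs the reload while holding the lock. A file watcher and a
// signal handler could both call this from their own goroutines.
func (g *reloadGate) Trigger(source string) {
	g.mu.Lock()
	defer g.mu.Unlock()
	g.reload(source)
}

// stressGate fires n concurrent triggers and reports how many reloads ran
// and the highest number that were ever in flight at once.
func stressGate(n int) (total, maxInFlight int) {
	inFlight := 0
	gate := &reloadGate{reload: func(string) {
		// These plain ints are safe only because Trigger holds the mutex.
		inFlight++
		if inFlight > maxInFlight {
			maxInFlight = inFlight
		}
		total++
		inFlight--
	}}

	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			gate.Trigger("demo")
		}()
	}
	wg.Wait()
	return total, maxInFlight
}

func main() {
	total, peak := stressGate(20)
	fmt.Printf("reloads=%d max concurrent=%d\n", total, peak)
}
```

A mutex keeps every request and simply queues them. Whether queued reloads should instead be coalesced into one is a separate design decision that this sketch does not address.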

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between dea9873 and e5fa388.

📒 Files selected for processing (3)
  • docs/configuration.md (1 hunks)
  • docs/examples/restart-on-config-change/README.md (3 hunks)
  • llama-swap.go (2 hunks)
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-06-05T22:57:53.793Z
Learnt from: mostlygeek
Repo: mostlygeek/llama-swap PR: 155
File: proxy/process.go:359-372
Timestamp: 2025-06-05T22:57:53.793Z
Learning: In Go, cmdWaitChan in proxy/process.go is a signal channel (chan struct{}) that follows the pattern where waitForCmd() closes the channel when cmd.Wait() returns, and stopCommand() waits on it. Reading from a closed channel returns immediately, so <-cmdWaitChan won't block indefinitely.

Applied to files:

  • llama-swap.go
📚 Learning: 2025-10-15T12:15:22.706Z
Learnt from: tesurijp
Repo: mostlygeek/llama-swap PR: 350
File: llama-swap-windows.go:17-17
Timestamp: 2025-10-15T12:15:22.706Z
Learning: In the llama-swap project, the `listenStr` flag format includes the colon prefix (e.g., `:8080`), so when constructing URLs, it can be directly concatenated with `http://localhost` to produce the correct format like `http://localhost:8080`.

Applied to files:

  • docs/examples/restart-on-config-change/README.md
🧬 Code graph analysis (1)
llama-swap.go (1)
proxy/proxymanager.go (1)
  • ProxyManager (30-53)
🪛 LanguageTool
docs/examples/restart-on-config-change/README.md

[grammar] ~3-~3: Ensure spelling is correct
Context: ...he configuration file can take a bit of trail and error to get a model configuration ...

(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: run-tests
🔇 Additional comments (1)
docs/examples/restart-on-config-change/README.md (1)

39-48: Updated sample output matches new reload behavior

The example log sequence (SIGUSR1 received, “Reloading configuration…”, “Configuration Changed”, “Configuration Reloaded”) aligns with the new signal handling and reloadProxyManager messages in llama-swap.go.

@blob42 blob42 marked this pull request as draft December 1, 2025 11:18
@blob42 blob42 force-pushed the feat-sigusr1-reload branch 3 times, most recently from f4ff658 to 169b3fb Compare December 1, 2025 13:49
@blob42 blob42 marked this pull request as ready for review December 1, 2025 13:49
@ryan-steed-usa
Contributor

Nice idea. Would it make more sense to re-appropriate SIGHUP like other popular daemons? SIGUSR1 and SIGUSR2 would then be available for future possibilities.

@blob42
Author

blob42 commented Dec 1, 2025

@ryan-steed-usa I first started with SIGHUP as a quick patch, but the docker kill command strangely terminated the container. Let me try again; maybe I overlooked something.

EDIT: I changed it to SIGHUP. On my first try I didn't compile properly :)

@blob42 blob42 force-pushed the feat-sigusr1-reload branch from 169b3fb to b38cfcb Compare December 1, 2025 17:27
@blob42 blob42 changed the title feat: reload config on SIGUSR1 feat: reload config on SIGHUP Dec 1, 2025
@blob42 blob42 force-pushed the feat-sigusr1-reload branch from b38cfcb to c9e0dfb Compare December 1, 2025 17:29
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
llama-swap.go (1)

88-119: Data race on srv.Handler during concurrent reload.

The srv.Handler field is accessed concurrently without synchronization:

  • Write access: Lines 100 and 117 assign a new ProxyManager to srv.Handler
  • Read access: The HTTP server reads srv.Handler.ServeHTTP() for each incoming request

In Go, unsynchronized concurrent read/write access to the same memory location constitutes a data race, even for pointer assignments. This can lead to undefined behavior, crashes, or serving requests with a partially-initialized handler.

Recommended solutions:

  1. Wrap srv.Handler in a synchronized wrapper (preferred for minimal disruption):
// Add near top of main():
type SafeHandler struct {
    mu      sync.RWMutex
    handler http.Handler
}

func (sh *SafeHandler) ServeHTTP(w http.ResponseWriter, r *http.Request) {
    sh.mu.RLock()
    h := sh.handler
    sh.mu.RUnlock()
    h.ServeHTTP(w, r)
}

func (sh *SafeHandler) Set(h http.Handler) {
    sh.mu.Lock()
    sh.handler = h
    sh.mu.Unlock()
}

Then modify the server setup and reloadProxyManager:

-srv := &http.Server{
-    Addr: *listenStr,
-}
+safeHandler := &SafeHandler{}
+srv := &http.Server{
+    Addr:    *listenStr,
+    Handler: safeHandler,
+}

And in reloadProxyManager:

-    if currentPM, ok := srv.Handler.(*proxy.ProxyManager); ok {
+    if sh, ok := srv.Handler.(*SafeHandler); ok {
+        sh.mu.RLock()
+        currentPM, isPM := sh.handler.(*proxy.ProxyManager)
+        sh.mu.RUnlock()
+        if isPM {
             // ... reload logic ...
-            srv.Handler = newPM
+            sh.Set(newPM)
+        }
     } else {
         // ... initial load ...
-        srv.Handler = newPM
+        if sh, ok := srv.Handler.(*SafeHandler); ok {
+            sh.Set(newPM)
+        }
     }
  2. Use atomic.Value for lock-free reads (alternative):
var handler atomic.Value  // stores http.Handler
srv.Handler = http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
    handler.Load().(http.Handler).ServeHTTP(w, r)
})
// In reloadProxyManager: handler.Store(newPM)
♻️ Duplicate comments (1)
docs/examples/restart-on-config-change/README.md (1)

3-3: Typo fixed: "trial and error" is now correct.

This addresses the previous review feedback.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 169b3fb and c9e0dfb.

📒 Files selected for processing (3)
  • docs/configuration.md (1 hunks)
  • docs/examples/restart-on-config-change/README.md (2 hunks)
  • llama-swap.go (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • docs/configuration.md
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-06-05T22:57:53.793Z
Learnt from: mostlygeek
Repo: mostlygeek/llama-swap PR: 155
File: proxy/process.go:359-372
Timestamp: 2025-06-05T22:57:53.793Z
Learning: In Go, cmdWaitChan in proxy/process.go is a signal channel (chan struct{}) that follows the pattern where waitForCmd() closes the channel when cmd.Wait() returns, and stopCommand() waits on it. Reading from a closed channel returns immediately, so <-cmdWaitChan won't block indefinitely.

Applied to files:

  • llama-swap.go
🧬 Code graph analysis (1)
llama-swap.go (1)
proxy/proxymanager.go (1)
  • ProxyManager (30-53)
🔇 Additional comments (2)
llama-swap.go (2)

80-80: LGTM: SIGHUP added to signal notifications.

The signal is correctly registered for handling configuration reloads.


173-201: Signal handling loop correctly implements reload and shutdown.

The refactored signal handling properly distinguishes between:

  • Graceful shutdown on SIGINT/SIGTERM (with timeout and cleanup)
  • Hot reload on SIGHUP
  • Logging for unexpected signals

The loop structure ensures the goroutine exits after shutdown signals but continues processing after reload signals.

Comment on lines +52 to 58
...
Sending SIGHUP to 495455
Received SIGHUP. Reloading configuration...
Configuration Changed
Configuration Reloaded
...
```
⚠️ Potential issue | 🟡 Minor

Output example shows correct SIGHUP flow but won't match current script.

The example output correctly demonstrates the SIGHUP reload sequence:

  • "Sending SIGHUP to [PID]"
  • "Received SIGHUP. Reloading configuration..."
  • "Configuration Changed"
  • "Configuration Reloaded"

However, this output won't occur with the current script because it sends SIGUSR1 instead of SIGHUP (see comment on lines 34-35). After fixing the signal in line 35, this example will be accurate.

🤖 Prompt for AI Agents
docs/examples/restart-on-config-change/README.md lines 52-58: the example output
shows a SIGHUP-based reload sequence but the actual example script sends SIGUSR1
(see comment/usage around line 34-35), so update the script to send SIGHUP
instead of SIGUSR1 (change the signal argument at line 35 to "SIGHUP"), then
re-run the example to ensure the README output matches runtime behavior.

@ryan-steed-usa
Contributor

@blob42 I've tested this with Podman (via my forked franken-builds) and it works well. Thanks!

@ryan-steed-usa
Contributor

ryan-steed-usa commented Dec 1, 2025

I still believe that this PR introduces useful functionality but it has helped me to understand why Docker/Podman with -watch-config didn't work as expected. The problem seems related to commit 6a058e4 and use of the example single-file volume mount -v /path/to/custom/config.yaml:/app/config.yaml. Using a directory mount like -v /path/to/custom/config:/app/config coupled with -config /app/config/config.yaml -watch-config works as desired with rootless Podman and root Docker! This is a good candidate for a reworked version of restart-on-config-change/README.md.

@mostlygeek
Owner

mostlygeek commented Dec 2, 2025

The --watch-config flag works in Docker if you volume-mount a directory with the config file in it and use a custom entrypoint like:

docker run -v /path/to/config_dir:/config --entrypoint "/app/llama-swap" llama-swap:cpu -config /config/config.yaml

This is a limitation of the fsnotify library I used: it can detect changes to a volume-mounted directory but not to a volume-mounted file.

Perhaps, instead of listening for new signals, a run.sh as the entrypoint to handle this would be better.

@ryan-steed-usa
Contributor

Overriding the entrypoint isn't necessary either since we already use the exec form of ENTRYPOINT:

ENTRYPOINT [ "/app/llama-swap", "-config", "/app/config.yaml" ]

Command line arguments to docker run will be appended after all elements in an exec form ENTRYPOINT, and will override all elements specified using CMD.

This allows arguments to be passed to the entry point, i.e., docker run -d will pass the -d argument to the entry point. You can override the ENTRYPOINT instruction using the docker run --entrypoint flag.

So a command like this works by appending the arguments:

docker run -v /path/to/config_dir:/config llama-swap:cpu -config /config/config.yaml -watch-config

It would help if this were to be mentioned in the docs.

@mostlygeek
Owner

@ryan-steed-usa thanks. I forgot that with Go's flag package, defining -config /config/config.yaml would override the one defined in the entrypoint.

I'll be sure to update the docs, as mounting a config directory is the workaround for making --watch-config work reliably.
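
The override behaviour discussed here can be checked with a small sketch using Go's standard `flag` package: when a flag appears more than once, each occurrence sets the value again, so the last one wins. The flag names `-config` and `-watch-config` come from this thread; the `parseArgs` helper, the FlagSet name, the default values, and the help strings are assumptions made for illustration and are not taken from llama-swap.

```go
package main

import (
	"flag"
	"fmt"
)

// parseArgs (hypothetical helper) parses args with a fresh FlagSet and
// returns the final -config value and whether -watch-config was set.
// Defaults and usage strings here are illustrative assumptions.
func parseArgs(args []string) (string, bool, error) {
	fs := flag.NewFlagSet("llama-swap-demo", flag.ContinueOnError)
	configPath := fs.String("config", "config.yaml", "path to the config file")
	watch := fs.Bool("watch-config", false, "reload when the config changes")
	if err := fs.Parse(args); err != nil {
		return "", false, err
	}
	return *configPath, *watch, nil
}

func main() {
	// The exec-form ENTRYPOINT elements come first, followed by the
	// arguments appended on the `docker run` command line.
	args := []string{
		"-config", "/app/config.yaml", // from ENTRYPOINT
		"-config", "/config/config.yaml", "-watch-config", // from docker run
	}

	path, watch, err := parseArgs(args)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Printf("config=%s watch=%v\n", path, watch)
}
```

The `flag` package also treats one and two leading dashes as equivalent, so `-watch-config` and `--watch-config` refer to the same flag.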

- Allows quick reloading with SIGHUP when a filesystem watch is not
  available, as in a default Docker setup: `docker kill -s SIGHUP`

- Also refactored the signal handling into a single logical unit

Signed-off-by: blob42 <contact@blob42.xyz>
@blob42 blob42 force-pushed the feat-sigusr1-reload branch from c9e0dfb to c7ca85e Compare March 25, 2026 15:34
@blob42
Author

blob42 commented Mar 25, 2026

@mostlygeek I am not sure if you still plan to merge this PR, but I just rebased it in case you do.

@coderabbitai coderabbitai bot left a comment

♻️ Duplicate comments (1)
docs/examples/restart-on-config-change/README.md (1)

12-13: ⚠️ Potential issue | 🔴 Critical

Critical: reload signal is still wrong (SIGUSR1 instead of SIGHUP).

At Line 35, the script sends SIGUSR1, but the app reload path is wired to SIGHUP. So config reload won’t trigger. This also makes the guidance at Lines 12-13 and sample output at Lines 53-56 inconsistent with executable behavior.

Proposed fix
-# For docker users, consider replacing:
-# `kill -USR1 $PID` with `docker kill -s SIGHUP container_name`
+# For docker users, equivalent command:
+# `docker kill -s SIGHUP container_name`

@@
-        kill -USR1 $PID
+        kill -HUP $PID

Also applies to: 34-35, 52-57

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/examples/restart-on-config-change/README.md` around lines 12 - 13, The
README's example uses the wrong signal (SIGUSR1) while the app listens for
SIGHUP, so update all occurrences of SIGUSR1 to SIGHUP (including the docker
guidance, the script example around the reload command, and the sample output
lines) to make the documented command, the advice (`kill -SIGHUP` / `docker kill
-s SIGHUP`), and the shown output consistent with the app's reload path; ensure
every mention (e.g., the script send signal, the docker note, and the sample
output) is changed to SIGHUP.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: e519d971-8c0a-44cf-a90c-c2cb8ac57f8f

📥 Commits

Reviewing files that changed from the base of the PR and between c9e0dfb and c7ca85e.

📒 Files selected for processing (3)
  • docs/configuration.md
  • docs/examples/restart-on-config-change/README.md
  • llama-swap.go
✅ Files skipped from review due to trivial changes (1)
  • docs/configuration.md
🚧 Files skipped from review as they are similar to previous changes (1)
  • llama-swap.go
